# RESEARCH ARTICLE

OPEN ACCESS

# Design for Arithmetic-Based Dynamic Binary-to-RNS Conversion Using Multi-Modulo Operations

Ganji Sarikayadav<sup>1</sup>, Dr. V. Thrimurthulu<sup>2</sup>, S. Ali Asgar<sup>3</sup>

<sup>1</sup>II M.Tech VLSI SD Student, CR Engineering College, Tirupathi, Chittoor (Dist), A.P, India, <sup>2</sup>Professor, Head of ECE Dept., CR Engineering College, Tirupathi, Chittoor (Dist), A.P, India, <sup>3</sup>Assistant Professor of ECE Dept., CR Engineering College, Tirupathi, Chittoor (Dist), A.P, India, <sup>1</sup>ganjisarika@gmail.com, <sup>2</sup>vtmurthy.v@gmail.com, <sup>3</sup>aliasgarsyed@gmail.com

## Abstract

A new ROM-less (read-only-memory-less) structure for binary-to-RNS (Residue Number System) conversion modulo $\{2^n\pm k\}$ is proposed. The proposed structure is based only on constant multipliers and adders. It remains efficient for larger values of n, for which the existing ROM-based structures are inefficient. Experimental results suggest that the proposed conversion structures significantly improve the forward conversion efficiency, with an area-time (AT) metric improvement above 100% regarding the related state of the art. Delay improvements of 2.17 times with only about 5% additional area can be achieved if a proper selection of the $\{2^n \pm k\}$ moduli is performed.

Keywords: Arithmetic, binary-to-RNS, forward conversion, residue number system.

# I. INTRODUCTION

The Residue Number System (RNS) is an unconventional, non-weighted number system in which additions, subtractions, and multiplications are inherently carry-free. This allows an increase in calculation speed and a reduction in power consumption. RNS uses remainders to represent numbers. Its main characteristics offer the potential for high-speed and parallel processing based on carry-free arithmetic [1]. Each particular RNS is defined by the moduli set that supports it. The moduli set is established by defining the moduli $m_i$, which are positive, pairwise relatively prime integers. A number X is represented in RNS by its residues $x_i = \langle X \rangle_{m_i}$, where $x_i$ is the remainder of the division of X by $m_i$. Conversion from a weighted number system to RNS (forward conversion, or binary-to-RNS), and vice versa (reverse conversion, or RNS-to-binary), is required in order to implement a complete RNS-based processing system. RNS is mainly used in computationally intensive applications such as digital signal processing, filtering, convolution, correlation, fast Fourier transform computation, and cryptography [2]-[4].
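
As a minimal illustrative sketch of the residue representation described above, the following Python snippet converts integers to their residues and then adds and multiplies them channel by channel. The moduli set (7, 8, 9) and the helper names `to_rns`, `rns_add`, and `rns_mul` are assumptions chosen for illustration only; they are not taken from this brief.

```python
from itertools import combinations
from math import gcd

# Illustrative moduli set (an assumption): pairwise relatively prime integers.
MODULI = (7, 8, 9)
assert all(gcd(a, b) == 1 for a, b in combinations(MODULI, 2))


def to_rns(x, moduli=MODULI):
    """Forward conversion: one residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Channel-wise addition; no carry crosses from one channel to another."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Channel-wise multiplication, again independent per channel."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))


x, y = 20, 13
print(to_rns(x))                      # (6, 4, 2)
print(rns_add(to_rns(x), to_rns(y)))  # (5, 1, 6), the residues of 33
print(rns_mul(to_rns(x), to_rns(y)))  # (1, 4, 8), the residues of 260
```

The results stay valid as long as the true sum or product is below the product of the moduli (here 504), which is the dynamic range of this small example.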

Early research on RNS was mainly focused on the three-moduli set $\{2^{n}-1, 2^{n}, 2^{n}+1\}$ [5]. Recently, different RNS moduli sets have been proposed in order to increase the dynamic range (DR) and/or reduce the width of the RNS channels, such as $\{2^{n}-3, 2^{n}-1, 2^{n}+1, 2^{n}+3\}$ in [6] and [7], $\{2^{n}-1, 2^{n+\beta}, 2^{n}+1\}$ in [8], and $\{2^{n}-1, 2^{n}, 2^{n}+1, 2^{n+1}+1, 2^{n+1}\}$ in [10]. More recently, a moduli set with a DR of up to (8n + 1) bits has been proposed.

The moduli $\{2^n\pm k\}$, with unrestricted k values, are useful to obtain larger RNS moduli sets [4], resulting in circuits with improved metrics. With larger moduli sets for the same DR, the operands are smaller and more compact arithmetic units are achieved, with reduced circuit area and delay. Most of the forward converters presented in the state of the art are limited in terms of the number of bits per channel and are unable to scale to larger DRs, given their exponential growth with the number of bits per channel, since they are mostly based on lookup tables (LUTs), such as read-only memories (ROMs).

The structures proposed in the related art are based on weighted reduction, considering serial, serial-parallel, or fully parallel approaches. An approach considering the periodicity property has also been proposed. The periodicity property can only be used to improve the conversion when short period values are found for modulo $\{2^n \pm k\}$. More recently, a novel conversion structure was proposed that overcomes this dependency by using the distributive property instead of periodicity, allowing for unrestricted modulo values. Multimoduli architectures have also been proposed, using a weight selection algorithm with binary additions and ROMs. These architectures allow the same arithmetic operations for different moduli within the same structure, but suffer from the same lack of scalability for larger DRs. Furthermore, since the brief herein presented is focused on simple single-modulo conversion structures, these are not herein considered. The analysis and implementation of a computer-aided design (CAD) tool has also been presented, capable of generating a structural description of binary-to-RNS converters for a DR of up to 21 bits. The proposed method actually allows forward converters to be obtained for any DR, as herein shown.

A novel ROM-less generic forward conversion structure for a DR of m = jn bits, using $\{2^{n}\pm k\}$ moduli, is proposed, considering $n \ge 2$. The proposed approach splits the jn input bits into j input sets, and computes the respective residue value using modular additions and constant multiplications. The use of constant multipliers in the proposed scheme does not impose the exponential area increase of the ROM-based topologies proposed in the related state of the art. To evaluate the performance of the proposed structure, experimental results were obtained. These results suggest that the proposed forward conversion topology allows improvements of 85% in area and 15% in delay when compared with the previously proposed ROM-based modulo $\{2^n \pm k\}$ binary-to-RNS converters. Moreover, improvements of up to 50% in delay can be achieved when compared with the previously proposed ROM-less converters, with a minimal increase in circuit area resources, 10% higher on average.

This brief is organized as follows. Section II introduces the formulation adopted to design the modulo $\{2^{n}\pm k\}$ binary-to-RNS conversion structure described in Section III. Section IV presents the experimental results, and compares the proposed topology with the related state of the art. The conclusion is presented in Section V.

## **II. RELATED WORK**

Considering an integer X with an m-bit input, represented as $\{X_{[m-1]}, \ldots, X_{[1]}, X_{[0]}\}$, with

$$X = \sum_{i=0}^{m-1} 2^{i} \cdot X_{[i]}$$

a forward converter for modulo $\{2^{n}\pm k\}$ transforms X into a residue value r with $w_{mod}$ bits, $\{r_{[w_{mod}-1]},\ldots,r_{[1]},r_{[0]}\}$, with $w_{mod} = \lceil \log_2(2^{n}\pm k) \rceil$; that is, $w_{mod} = n$ for modulo $\{2^{n}-k\}$ and $w_{mod} = n+1$ for modulo $\{2^{n}+k\}$, when $0 < k < 2^{n}$. The residue modulo $\{2^{n}\pm k\}$ of the input value X can be obtained by computing the integer division of X by $2^{n}\pm k$ and taking the remainder, but this is a costly operation. The approach herein considered to derive $\langle X \rangle_{2^n \pm k}$ is based solely on simple modular arithmetic operations. Given the property

$$\langle 2^{n} \rangle_{2^n \pm k} = \langle 2^{n} \pm k \mp k \rangle_{2^n \pm k} = \langle \mp k \rangle_{2^n \pm k} \tag{1}$$

and with $X_{i} \equiv X_{[(i+1) \cdot n-1:i \cdot n]}$, where $X_{[msb:lsb]}$ represents the msb to lsb bits of integer X, and considering X as the binary representation of an integer to be converted, with a DR of jn bits, the residue modulo $\{2^{n}-k\}$ of the value X can be computed as

$$\begin{aligned}
\langle X \rangle_{2^n-k} &= \langle 2^{(j-1)n} X_{[jn-1:(j-1)n]} + \dots + 2^{2n} X_{[3n-1:2n]} + 2^n X_{[2n-1:n]} + X_{[n-1:0]} \rangle_{2^n-k} \\
&= \langle 2^{(j-1)n} X_{j-1} + \dots + (2^n - k + k)(2^n - k + k) X_2 + (2^n - k + k) X_1 + X_0 \rangle_{2^n-k} \\
&= \langle k^{j-1} X_{j-1} + \dots + k^2 X_2 + k X_1 + X_0 \rangle_{2^n-k} \\
&= \left\langle \sum_{i=0}^{j-1} k^i X_i \right\rangle_{2^n-k}
\end{aligned} \tag{2}$$
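
The decomposition in (2) can be checked numerically with a short Python sketch. The function names `split_segments` and `residue_minus` are hypothetical, and the final `%` operator simply stands in for the modular adders of a hardware implementation; this is a software model under those assumptions, not the proposed circuit.

```python
def split_segments(x, n, j):
    """Split a jn-bit integer into j segments of n bits, X_0 being the least significant."""
    mask = (1 << n) - 1
    return [(x >> (i * n)) & mask for i in range(j)]


def residue_minus(x, n, k, j):
    """Residue of x modulo 2^n - k, computed as the sum of k^i * X_i as in (2)."""
    m = (1 << n) - k
    segments = split_segments(x, n, j)
    return sum(k**i * xi for i, xi in enumerate(segments)) % m


# Small worked case: n = 4, k = 3 (modulus 13), j = 2, X = 183 = 0xB7.
# X_1 = 11 and X_0 = 7, so k*X_1 + X_0 = 40, and 40 mod 13 = 1 = 183 mod 13.
print(residue_minus(183, n=4, k=3, j=2))  # 1
```

Only the j segment values and the powers of k are needed, so no table indexed by the input bits is required.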
Identically, the residue modulo  $\{2^n + k\}$  of X can be computed as

$$\begin{aligned} \langle X \rangle_{2^n+k} &= \langle (-k)^{j-1} X_{j-1} + \dots + k^{2} X_{2} - k X_{1} + X_{0} \rangle_{2^n+k} \\ &= \left\langle \sum_{i=0}^{\lfloor (j-1)/2 \rfloor} k^{2i} X_{2i} - \sum_{i=0}^{\lfloor (j-2)/2 \rfloor} k^{2i+1} X_{2i+1} \right\rangle_{2^n+k} \end{aligned} \tag{3}$$
  
The modular subtractions in (3) can be computed as

$$\langle -X_{l} \rangle_{2^n+k} = \langle 2^{n} - 1 - X_{l} + k + 1 \rangle_{2^n+k} = \langle \overline{X}_{l} + k + 1 \rangle_{2^n+k} \tag{4}$$

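Given (4), (3) can be rewritten as

$$\langle X \rangle_{2^n+k} = \left\langle \sum_{i=0}^{\lfloor (j-1)/2 \rfloor} k^{2i} X_{2i} + \sum_{i=0}^{\lfloor (j-2)/2 \rfloor} k^{2i+1} \overline{X}_{2i+1} + cst \right\rangle_{2^n+k}$$

where

$$cst = \sum_{i=0}^{\lfloor (j-2)/2 \rfloor} k^{2i+1} (k+1)$$

is a constant (cst).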
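
The modulo $\{2^n+k\}$ case can be sketched in the same way: every odd-indexed segment is replaced by its n-bit complement, as in (4), and the accumulated $k^{2i+1}(k+1)$ corrections are added as one constant. The function name `residue_plus` is hypothetical, and the final `%` is again only a stand-in for the modular adder tree.

```python
def residue_plus(x, n, k, j):
    """Residue of x modulo 2^n + k using complemented odd segments plus a constant."""
    m = (1 << n) + k
    mask = (1 << n) - 1
    segments = [(x >> (i * n)) & mask for i in range(j)]

    # Even-indexed segments enter with a positive sign.
    even = sum(k**i * segments[i] for i in range(0, j, 2))
    # Odd-indexed segments are negative: -X_i is replaced by (complement of X_i) + k + 1.
    odd = sum(k**i * (mask - segments[i]) for i in range(1, j, 2))
    # The (k + 1) corrections do not depend on the input, so they form a constant.
    cst = sum(k**i * (k + 1) for i in range(1, j, 2))
    return (even + odd + cst) % m


# Small worked case: n = 4, k = 3 (modulus 19), j = 2, X = 183.
# X_0 = 7, complement of X_1 = 15 - 11 = 4, so 7 + 3*4 + 3*4 = 31 and 31 mod 19 = 12.
print(residue_plus(183, n=4, k=3, j=2))  # 12
```

Because `cst` depends only on n, k, and j, it can be computed once at design time rather than per conversion.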

# **III. HARDWARE STRUCTURES**

To the best of the authors' knowledge, the existing generic modulo $\{2^n\pm k\}$ binary-to-RNS converters are based on the weight-selection approach.

Chadalawada Ramanamma Engineering College

International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 National Level Technical Symposium On Emerging Trends in Engineering & Sciences (NLTSETE&S- 13<sup>th</sup> & 14<sup>th</sup> March 2015)

In the first approach, each n-bit segment of the binary input value is passed through a lookup table (LUT), outputting the corresponding residue value. Each table has an n-bit input and an n-bit output. The outputs of these LUTs are then added by an adder-tree structure and further reduced modulo $\{2^{n}\pm k\}$ by an additional lookup table and a final modulo $\{2^{n}\pm k\}$ adder. This conversion structure is herein referred to as Piestrak. Given the exponential area increase of the lookup tables, implemented by ROMs, this structure is only efficient for small values of n. A conversion structure considering weight selection based only on modular adders and multiplexers has also been proposed, herein referred to as Premkumar. The same author extended this work using the periodicity property, which can also be employed in the remaining state of the art. However, this technique can only be used to improve the conversion when shorter period values for modulo $\{2^n \pm k\}$ can be found. More recently, a new converter was proposed, allowing the best conversion implementations for DRs up to 64 bits with moduli up to 6 bits. This architecture is based on a depth-bounded carry-save addition, with a carry-free reduction of the sum and carry vectors. The final reduction is accomplished using a modified modulo adder and a lookup table. For larger moduli and DRs, the usage of the lookup table makes it once more not scalable.

Another weight-selection-based conversion approach has also been proposed. However, in this case, the resulting structure is based solely on full adders, allowing it to scale to larger moduli channels and DRs. The authors also made available a CAD tool to facilitate the implementation of the proposed conversion structures. However, the developed tool only considers DRs up to 21 bits. In order to obtain conversion structures for the considered DRs, an augmented version of this CAD tool was developed, considering the proposed method. The version implemented by us now allows for unrestricted DRs, and was designed to use the same optimized arithmetic structures used in the herein proposed conversion structures. The conversion structures generated by the new tool are herein referred to as Soudris.
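
To give a feeling for why table-based converters stop scaling, the sketch below counts the storage bits of a single lookup table with an n-bit input and an n-bit output, as described above for the LUT-based approach. The function name `lut_bits` is hypothetical, and the count ignores address decoding and any ROM compiler overhead, so it is a lower-bound illustration rather than an area model.

```python
def lut_bits(n):
    """Storage bits of one table with an n-bit input and an n-bit output: 2^n words of n bits."""
    return (1 << n) * n


for n in (6, 8, 16, 32):
    print(f"n = {n:2d}: {lut_bits(n):,} bits per table")
```

Each additional input bit more than doubles the table, whereas a constant multiplier grows only polynomially with the operand width.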

Given the inadequacy of most of the related state-of-the-art structures for larger values of n, two new generic binary-to-RNS conversion structures are herein proposed for modulo $\{2^n \pm k\}$, only requiring arithmetic operations. These conversion units only require constant multiplication and addition units: 1) the Proposed I structure targets a more compact structure with a serialized modular reduction approach, which is an extension of the previously presented work, but now allowing X modulo $\{2^n \pm k\}$ to be computed with a DR of jn bits instead of the previous 4n bits; 2) the Proposed II structure targets faster conversion considering a parallel approach. In these proposed structures, a restriction on the width of k ($\omega_k$) is considered, allowing the modular reduction step to be simplified, namely $\omega_k \equiv \lceil \log_2 k \rceil \le n/2$. Even with this restriction on $\omega_k$, the $k^{i}$ constant value in (2) and (3) can be greater than $2^{n} \pm k$ for $i \geq 2$, since $k < 2^{n/2}$ results in $k^{i} < 2^{n \cdot i/2}$. In these cases, the resulting constant $\langle k^i \rangle_{2^n \pm k}$ can be precomputed and reduced modulo $\{2^n \pm k\}$, providing a constant value with $\omega_{k^i}$ bits, where $\omega_{k^i} = \lceil \log_2(\langle k^i \rangle_{2^n \pm k}) \rceil$. The maximum width of the k values is represented by $\omega_{kmax} = \max(\omega_{k^i})$, $\forall\, 0 \leq i \leq j-1$.
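
The precomputation of the reduced constants can be sketched as follows. The function name `reduced_constants` and its `plus` flag are hypothetical, and Python's `int.bit_length()` is used as the width, which is an assumption: it agrees with $\lceil \log_2 \cdot \rceil$ except at exact powers of two.

```python
def reduced_constants(n, k, j, plus=False):
    """Return the constants k^i reduced modulo 2^n - k (or 2^n + k) and their bit widths."""
    m = (1 << n) + k if plus else (1 << n) - k
    consts = [pow(k, i, m) for i in range(j)]   # three-argument pow reduces modulo m
    widths = [c.bit_length() for c in consts]   # assumed stand-in for the width of each constant
    return consts, widths


n, j = 8, 4
for k in (3, 15):                      # 15 = 2^(n/2) - 1 is the widest k allowed for n = 8
    assert k.bit_length() <= n // 2    # the width restriction on k
    consts, widths = reduced_constants(n, k, j)
    print(k, consts, widths, max(widths))
```

For k = 3 the constants are 1, 3, 9, 27 and never need reducing, while for k = 15 the third power already exceeds the modulus 241 and reduces to 1, which illustrates why the width of each constant has to be evaluated individually.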

The Proposed II structure for modulo $\{2^n \pm k\}$ forward conversion is based on a parallel approach. In this approach, each partial operation $(k^i \cdot X_i)$ is first reduced modulo $\{2^n \pm k\}$, and only then added to obtain the final residue value. Thus, the computation is performed in two stages: the first stage computes the constant multiplication vectors, and the second stage performs the addition of all those vectors. This structure performs the modular reduction of each calculation using the modulo $\{2^n \pm k\}$ carry-save adder, instead of adding all terms and reducing them iteratively at the end, as in Proposed I. The considered modular $k^i$ constant multiplication block computes

$\langle \langle k^{i} \rangle_{2^n \pm k} \cdot X_{i} \rangle_{2^n \pm k}$ as

$$\begin{aligned}
\langle \langle k^{i} \rangle_{2^n \pm k} \cdot X_{i} \rangle_{2^n \pm k} &= \langle k^{i}_{[\omega_{k^i}-1:0]} \cdot X_{[(i+1) \cdot n-1:i \cdot n]} \rangle_{2^n \pm k} \\
&= \langle P^{1}_{[n+\omega_{k^i}-1:0]} \rangle_{2^n \pm k} \\
&= \langle 2^{n} \cdot P^{1}_{[n+\omega_{k^i}-1:n]} + P^{1}_{[n-1:0]} \rangle_{2^n \pm k} \\
&= \langle \mp k \cdot P^{1}_{[n+\omega_{k^i}-1:n]} + P^{1}_{[n-1:0]} \rangle_{2^n \pm k} \\
&= \langle P^{2}_{[\omega_{k^i}+\omega_k-1:0]} + P^{1}_{[n-1:0]} \rangle_{2^n \pm k}
\end{aligned}$$



Fig. 1. Proposed II binary-to-RNS converter modulo  $\{2^{n}\pm k\}.$ 

However, when $\omega_{k^i}+\omega_k > n$, the value $P^2$ has more than n bits, requiring an additional reduction step, computed as

$$\begin{aligned}
\langle \langle k^{i} \rangle_{2^n \pm k} \cdot X_{i} \rangle_{2^n \pm k} &= \langle \mp k \cdot P^{2}_{[\omega_{k^i}+\omega_k-1:n]} + P^{2}_{[n-1:0]} + P^{1}_{[n-1:0]} \rangle_{2^n \pm k} \\
&= \langle P^{3}_{[\omega_{k^i}+2\omega_k-n-1:0]} + P^{2}_{[n-1:0]} + P^{1}_{[n-1:0]} \rangle_{2^n \pm k}
\end{aligned} \tag{7}$$

Therefore, the $k^i$ constant modular multiplication can be implemented with two constant multipliers when $(\omega_{k^i}+\omega_k) \leq n$, or with an additional constant multiplier and a 3:2 modular compressor when $(\omega_{k^i}+\omega_k) > n$. The resulting modular $k^i$ constant multiplication block is shown in Fig. 1. The additional resources for the more complex implementation case are represented with dashed lines. After the computation of the partial products $\langle k^i \cdot X_i \rangle_{2^n \pm k}$, the resulting carry and save vectors are fed into a modular adder tree to obtain the final result $\langle X \rangle_{2^n \pm k}$. In the particular case of the binary-to-RNS conversion modulo $\{2^n+k\}$, an additional input is required in the modular adder tree to compute the addition of the correction factor cst, from (6).
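
The two-step and three-step reductions above can be modelled in software for the modulo $\{2^n-k\}$ case, where the sign of every term is positive. The names `partial_terms` and `convert_parallel` are hypothetical, widths are taken with `int.bit_length()`, and the final `sum(...) % m` is only a stand-in for the modular adder tree; this is a behavioural sketch under those assumptions, not a description of the circuit in Fig. 1.

```python
def partial_terms(xi, c, n, k):
    """Terms of at most n bits whose sum is congruent to c * xi modulo 2^n - k."""
    mask = (1 << n) - 1
    p1 = c * xi                  # P1: first constant multiplication
    p2 = k * (p1 >> n)           # P2: upper bits of P1 times k, since 2^n = k (mod 2^n - k)
    if c.bit_length() + k.bit_length() <= n:
        return [p2, p1 & mask]   # P2 is guaranteed to fit in n bits
    p3 = k * (p2 >> n)           # P3: additional reduction step when P2 can exceed n bits
    return [p3, p2 & mask, p1 & mask]


def convert_parallel(x, n, k, j):
    """Residue of a jn-bit x modulo 2^n - k from independently reduced partial products."""
    m = (1 << n) - k
    mask = (1 << n) - 1
    terms = []
    for i in range(j):
        xi = (x >> (i * n)) & mask   # segment X_i
        c = pow(k, i, m)             # precomputed constant k^i reduced modulo 2^n - k
        terms.extend(partial_terms(xi, c, n, k))
    return sum(terms) % m            # stand-in for the modular adder tree


x = 0xDEADBEEF
print(convert_parallel(x, n=8, k=15, j=4) == x % 241)  # True
```

The choice between two and three terms is made from the widths of the constants alone, so in this model, as in the text, it is fixed at design time and does not depend on the input value.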

#### **IV. EXPERIMENTAL RESULTS**

In order to fully evaluate the proposed binary-to-RNS converters and the related state of the art, all structures were described in Very High Speed Integrated Circuit Hardware Description Language (VHDL) and mapped to an application-specific integrated circuit (ASIC) technology, in particular the United Microelectronics Corporation (UMC) 0.13-µm CMOS technology. The ROM results were obtained by synthesizing the available ROM sizes in the synchronous via-1 ROM compiler for the UMC 0.13-µm high-speed logic process technology, and the others were estimated based on the actually obtained values. Both synthesis and mapping were performed using Design Vision E-2010.12-SP4 from Synopsys.

In order to take into account the impact of different values of n and k, the presented experimental results were obtained for a variation of $n \in [6, 32]$. For the value of k, the best case is obtained for k = 3, and for the worst case, the values $k = 2^{n/2}-1$ and $\langle k^i \rangle_{2^n \pm k} = 2^n-1$ are considered. The worst case value of k corresponds to the worst case scenario for the Proposed I and Proposed II structures, implying the most complex and costly multipliers. Since for the Soudris structure the worst case scenario cannot be easily determined, and to simplify the analysis, the same value of k, $k = 2^{n/2}-1$, is used as the worst case value. For the Premkumar structure, the presented values are for k = 3, since no significant variations occur for other values of k.

In order to properly evaluate the cost of the several conversion structures, RNSs with four and eight moduli channels are considered, resulting in DRs of 4n and 8n bits, respectively. The obtained experimental results for circuit area and delay are shown in Fig. 2(a), (b), (d), and (e).

As expected from the theoretical analysis, the ROM-based conversion structure imposes significantly worse area metrics, even for small DRs. As the DR increases, and with it the size of the ROM, the area of the ROM-based conversion structures increases significantly, due to the exponential increase of the ROM area. Like the Piestrak converter, the Premkumar structure has worse area metrics for small moduli and DRs. However, the Premkumar structure requires less area than the Piestrak for larger moduli, namely for modulo channels with more than 10 bits. Nevertheless, the Premkumar structure requires up to 6.85 times more area resources than Proposed II.

The Soudris conversion structure always requires more area resources for k = 3 when compared with Proposed II. However, when considering the worst case of k, the Soudris converters require less circuit area than the proposed converters. Note that the worst case scenario for the herein proposed conversion structures is not realistic, since it considers that all the constant multipliers are in the most complex form. Still, less area demanding conversion units can be obtained with the proposed conversion structures when compared with the related state of the art, in particular for larger values of n. Given the parallel approach of the Proposed II structure, slightly higher area requirements are observed when compared with Proposed I. From the obtained area results, it can also be concluded that the area increases with the number of moduli channels (j), since more constant multipliers and adders are needed, in particular for the more parallel Proposed II structure.

The obtained experimental results suggest that the ROM-based related art (Piestrak) can in fact be faster, but only for values of n up to 18 bits. The modular adder-based conversion structure from Premkumar has a worse delay metric than Piestrak for n up to 18 bits, but achieves better delay metrics for n greater than 20 bits. The Soudris conversion structure has the worst delay of the considered structures; nevertheless, for n greater than 22 bits, better delay metrics than the Piestrak conversion structure can be achieved. Regarding the proposed conversion structures, the Proposed II conversion structure is able to achieve lower delay metrics than the related state of the art for n as low as 8 bits. The Proposed II structure is always faster for n greater than 18 bits, even for the worst case of k, for acceptable area costs. The delay improvement inversion point depends on the number of modulo channels (j) and the considered k values. For example, the Premkumar converter achieves a better delay metric than ours in the worst cases, while requiring on average 5.42 times more area resources.

On average, the obtained results, for k = 3 and for the worst value of k, suggest that the Proposed I structure requires 91%-87% less area, and is 10% faster to 16% slower than the Piestrak and Premkumar conversion structures, respectively. When compared with the Soudris structure, Proposed I allows for a 12% faster conversion at a cost of 44% more area resources. The experimental results suggest that the Proposed II structure is on average 85%-80% smaller, and 30%-5% faster than the Piestrak and the Premkumar structures, respectively. When compared with the Soudris structure, the experimental results suggest that the Proposed II structure is on average 40% faster at a cost of 56% more circuit area. Note that this average is obtained from k = 3 and the worst case of k, which does not allow the resulting metrics to be fully interpreted. To better understand the variation of these metrics with the value of k, the Premkumar, the Soudris, and the Proposed II conversion structures were synthesized for all possible k values for n = 20, considering a DR of 4n bits and modulo $\{2^n - k\}$, as shown in Fig. 2(c) and (f). From these results, it can be seen that the Premkumar structure has a stable value for the delay metric, meaning that the obtained results are less dependent on the value of k.




Fig. 2. Experimental results for the binary-to-RNS converters. Area: (a) DR 4n, modulo $\{2^{n}\pm k\}$; (b) DR 8n, modulo $\{2^{n}\pm k\}$; (c) j = 4, n = 20, varying k. Delay: (d) DR 4n, modulo $\{2^{n}\pm k\}$; (e) DR 8n, modulo $\{2^{n}\pm k\}$; (f) j = 4, n = 20, varying k.

However, this invariance is achieved at significantly higher area costs, up to 18 times more area resources than Proposed II, being outside of the considered scale in Fig. 2(c). The obtained results also suggest that the Proposed II structure is always faster than Soudris, in particular for smaller values of k.



Fig. 3. Ratio of the energy consumption for j = 4, n = 20, varying k.

For the first 100 k values, the structure herein proposed achieves an average delay improvement of 48% when compared with the Soudris structure, at a cost of 36% more area resources. If the best 16 k values of each structure are considered, the Proposed II structure is on average 2.17 times faster with just 5% additional area resources, resulting in an AT performance metric improvement of 108%. When considering the energy consumption per conversion, the Proposed II structure shows the largest variation according to the selected k value, as shown in Fig. 3. Nevertheless, the obtained results suggest that with an adequate selection of the moduli set (i.e., the k values), significantly less energy is required for the conversion. For example, for a moduli set with four channels, the Soudris and the Premkumar structures require 128% and 594% more energy, respectively, than the Proposed II structure, with a modulo conversion requiring an average of 110 nJ. The results also suggest that, if the moduli set is extended to 16 channels, the Soudris and the Premkumar structures will require 50% and 264% more energy than the Proposed II structure, respectively. From this, it can be concluded that the Proposed II conversion structure allows for significantly more efficient conversion metrics, in particular if the k values are adequately selected.
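
As a rough consistency check on the best-case figures quoted above, and assuming that the AT metric is the product of circuit area A and delay T (the brief does not define it explicitly), the ratio between the Soudris (S) and Proposed II (P) structures can be estimated as

$$\frac{A_{S}\, T_{S}}{A_{P}\, T_{P}} = \frac{T_{S}/T_{P}}{A_{P}/A_{S}} \approx \frac{2.17}{1.05} \approx 2.07$$

that is, an improvement of roughly 107%, close to the reported 108%. The small difference is presumably due to rounding in the quoted 2.17 and 5% values.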

## V. CONCLUSION

In this brief, two generic and scalable modulo $\{2^{n}\pm k\}$ binary-to-RNS conversion structures are proposed for jn-bit DRs. The proposed approach splits the jn input bits into j input sets, and computes the respective residue value using modular additions and constant multiplications, implementing ROM-less structures. To assess the gains achieved by the proposed structures, experimental results were obtained for 4n- and 8n-bit DRs. The obtained experimental results suggest that the proposed conversion approach allows for average delay improvements between 10% and 40%, with a worst average area cost between 44% and 56%, regarding the best existing state of the art. Nevertheless, if a proper selection of the used k values is performed, 2.17 times faster conversion operations with only 5% extra area resources can be achieved, with AT performance metric improvements above 100%.

## ACKNOWLEDGEMENT

I express my sincere thanks to my guide, Mr. S. Ali Asgar, M.Tech, Assistant Professor of the ECE Department, and to my head of department, Dr. V. Thrimurthulu, M.E., Ph.D., MIETE, MISTE, Professor & Head of the ECE Department, CREC, Tirupathi, for their valuable guidance and useful suggestions, which helped me in the project work.

## REFERENCES

- [1] N. Szabo and R. Tanaka, Residue Arithmetic and Its Applications to Computer Technology. New York, NY, USA: McGraw-Hill, 1967.
- [2] G. Cardarilli, A. Nannarelli, and M. Re, "Residue number system for low-power DSP applications," in Proc. 41st ACSSC, 2007, pp. 1412–1416.


- [3] J. Bajard and L. Imbert, "A full RNS implementation of RSA," IEEE Trans. Comput., vol. 53, no. 6, pp. 769–774, Jun. 2004.
- [4] S. Antão, J.-C. Bajard, and L. Sousa, "RNS based elliptic curve point multiplication for massive parallel architectures," Comput. J., vol. 55, no. 5, pp. 629–647, 2011.
- [5] F. E. P. D. Gallaher and P. Srinivasan, "The digit parallel method for fast RNS to weighted number system conversion for specific moduli $\{2^n-1, 2^n, 2^n+1\}$," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 44, no. 1, pp. 53–57, Jan. 1997.
- [6] P. A. Mohan, "Reverse converters for the moduli sets $\{2^{2N}-1, 2^N, 2^{2N}+1\}$ and $\{2^N-3, 2^N+1, 2^N-1, 2^N+3\}$," in Proc. SPCOM, Dec. 2004, pp. 188–192.
- [7] M.-H. Sheu, S.-H. Lin, C. Chen, and S.-W. Yang, "An efficient VLSI design for a residue to binary converter for general balance moduli $\{2^n-3, 2^n+1, 2^n-1, 2^n+3\}$," IEEE Trans. Circuits Syst., Exp. Briefs, vol. 51, no. 3, pp. 152–155, Mar. 2004.
- [8] R. Chaves and L. Sousa, "$\{2^n+1, 2^{n+k}, 2^n-1\}$: A new RNS moduli set extension," in Proc. EUROMICRO Syst. Digit. Syst. Des., Sep. 2004, pp. 210–217.

# AUTHORS:



G. Sarika yadav received her B.Tech degree in Electronics & Communication Engineering from SV Engineering College for Women, Tirupati (A.P), India, in the year 2013. She is currently pursuing her M.Tech degree in VLSI System Design at Chadalawada Ramanamma Engineering College, Tirupati (A.P), India. Her area of research includes low power VLSI design.



Dr. V. Thrimurthulu, M.E., Ph.D., MIETE, MISTE, is Professor & Head of the ECE Dept. He received his graduation in Electronics & Communication Engineering (AMIETE) in 1994 from the Institute of Electronics & Telecommunication Engineering, New Delhi, his postgraduation in Engineering (M.E.), with specialization in Microwaves and Radar Engineering, in February 2003 from University College of Engineering, Osmania University, Hyderabad, and his Doctorate in Philosophy (Ph.D.) from Central University in the year 2012. He has done his research work on Ad-Hoc Networks.



S. Ali Asgar was born in Tirupati, Andhra Pradesh, India. He received his B.Tech degree in Electronics & Instrumentation Engineering from Sree Vidyanikethan Engineering College, Rangampeta, in the year 2007. He did his M.Tech in Digital Signals in 2012 from SJCET, Yemmiganur (A.P), India. He is currently working as Associate Professor in Chadalawada Ramanamma Engineering College, Tirupati (A.P), India.